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6.3 Multigrid Methods 



The Jacobi and Gauss-Seidel iterations produce smooth errors. The error vector e 
has its high frequencies nearly removed in a few iterations. But low frequencies are 
reduced very slowly. Convergence requires 0(N 2 ) iterations — which can be unaccept- 
able. The extremely effective multigrid idea is to change to a coarser grid, on which 
"smooth becomes rough" and low frequencies act like higher frequencies. 

On that coarser grid a big piece of the error is removable. We iterate only a few 
times before changing from fine to coarse and coarse to fine. The remarkable result 
is that multigrid can solve many sparse and realistic systems to high accuracy in a 
fixed number of iterations, not growing with n. 

Multigrid is especially successful for symmetric systems. The key new ingredients 
are the (rectangular!) matrices R and / that change grids: 



1. A restriction matrix R transfers vectors from the fine grid to the coarse grid. 

2. The return step to the fine grid is by an interpolation matrix I = I^ h . 

3. The original matrix Ah on the fine grid is approximated by A^h = RA^I on 
the coarse grid. You will see how this A 2 h is smaller and easier and faster than 
Ah- I will start with interpolation (a 7 by 3 matrix / that takes 3 v's to 7 u's): 



Interpolation Iv = u 
u on the fine (h) grid from 
v on the coarse (2h) grid 
values are the w's. 
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This example has h = | on the interval < x < 1 with zero boundary conditions. 
The seven interior values are the u's. The grid with 2h = \ has three interior u's. 

Notice that u 2 , «4, u e from rows 2, 4, 6 are the same as Vx, v 2 , v 3 ! Those coarse grid 
values Vj are just moved to the fine grid at the points x — |, |, |. The in-between 
values «i, W3, W5, W7 on the fine grid are coming from linear interpolation between 
0,v 1 ,v 2 ,v 3 ,0: 



Linear interpolation in rows 1, 3, 5, 7 u 2 j+i 



-(Vj + V j+lj 



(2) 



The odd-numbered rows of the interpolation matrix have entries | and |. We almost 
always use grid spacings h,2h,4:h, . . . with the convenient ratio 2. Other matrices I 
are possible, but linear interpolation is easy and effective. Figure 6.10a shows the 
new values «2j+i (open circles) between the transferred values u 2 j = Vj (solid circles). 
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U 2 = Vi 
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(a) Linear interpolation by u = I^v (b) Restriction by B? h u = ^(hhJ 

Figure 6.10: Interpolation to the h grid (7 w's). Restriction to the 2h grid (3 v's 



u 



When the u's represent smooth errors on the coarse grid (because Jacobi or Gauss- 
Seidel has been applied on that grid), interpolation gives a good approximation to 
the errors on the fine grid. A practical code can use 8 or 10 grids. 

The second matrix we need is a restriction matrix Rfr- It transfers u on a 
fine grid to v on a coarse grid. One possibility is the one-zero "injection matrix" that 
simply copies v from the values of u at the same points on the fine grid. This ignores 
the odd-numbered fine grid values «2j+i- Another possibility (which we adopt) is the 
full weighting operator R that comes from transposing I^ h . 

Fine grid h to coarse grid 2h by a restriction matrix Rff 1 = |(^2h) T 



Full weighting Ru = v 1 
Fine grid u to coarse grid v 4 
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(3) 



The effect of this restriction matrix is shown in Figure 6.10b. We intentionally chose 
the special case in which Uj = sin(2j7r/8) on the fine grid (open circles). Then v on 
the coarse grid (dark circles) is also a pure sine vector. But the frequency is doubled 
(a full cycle in 4 steps). So a smooth oscillation on the fine grid becomes "half as 
smooth" on the coarse grid, which is the effect we wanted. 



Interpolation and Restriction in Two Dimensions 

Coarse grid to fine grid in two dimensions from bilinear interpolation: Start 
with values Vij on a square or rectangular coarse grid. Interpolate to fill in Uij by a 
sweep (interpolation) in one direction followed by a sweep in the other direction. We 
could allow two spacings h x and h y , but one meshwidth h is easier to visualize. A 
horizontal sweep along row i of the coarse grid (which is row 2i of the fine grid) will 
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fill in values of u at odd-numbered columns 2j + 1 of the fine grid: 

Horizontal sweep W2i.2j — Vij and W2j,2j+i — ~{ v i,j + v i,j+i) as i n ID- (4) 

Now sweep vertically, up each column of the fine grid. Interpolation will keep those 
values (4) on even- numbered rows 2%. It will average those values to find u = J2D v 
on the fine-grid odd- numbered rows 2% + 1: 

Vertical sweep u 2 i+i,2j = {v id + v i+ i d )/2 

(5) 

Averages of (4) w 2 i+i,2j+i = Om + v i+l ,j + v i>j+1 + v i+1> j +1 )/A . 

The entries in the tall thin coarse-to-fine interpolation matrix J2D are 1, |, and |. 

The full weighting fine-to-coarse restriction operator R2D is the transpose J2D T , 
multiplied by |. That factor is needed (like | in one dimension) so that a constant 
vector of l's will be restricted to a constant vector of l's. (The entries along each row 
of the wide matrix R add to 1.) This restriction matrix has entries j, |, and and 
each coarse-grid value v is a weighted average of nine fine-grid values u: 



Restriction matrix R = \ I T X / 16 * f^-f 
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You can see how a sweep along each row with weights |, |, |, followed by a sweep 
down each column, gives the nine coefficients in that "restriction molecule." Its 
matrix R2D is an example of a tensor product or Kronecker product kron(i?, R) . A 
3 by 7 matrix R in one dimension becomes a 9 by 49 restriction matrix R2D in two 
dimensions. 

Now we can transfer vectors between grids. We are ready for the geometric 
multigrid method, when the geometry is based on spacings h and 2h and 4h. The 
idea extends to triangular elements (each triangle splits naturally into four similar 
triangles). The geometry can be more complicated than our model on a square. 

When the geometry becomes too difficult, or we are just given a matrix, we turn 
(in the final paragraph) to algebraic multigrid. This will imitate the multi-scale 
idea, but it works directly with Au = b and not with any underlying geometric grid. 



A Two-Grid V-Cycle (a v-cycle) 

Our first multigrid method only involves two grids. The iterations on each grid can 
use Jacobi's / — D~ X A (possibly weighted by uj = 2/3 as in the previous section) or 
Gauss-Seidel. For the larger problem on the fine grid, iteration converges slowly to 
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the low frequency smooth part of the solution u. The multigrid method transfers the 
current residual r^ = b — Auh to the coarse grid. We iterate a few times on that 2h 
grid, to approximate the coarse-grid error by E 2 h- Then interpolate back to Eh on 
the fine grid, make the correction to Uh + Eh, and begin again. 

This fine-coarse- fine loop is a two-grid V-cycle. We call it a v-cycle (small v). 
Here are the steps (remember, the error solves Ah{u — tt/J = bh — AhUh = r^): 



1. 


Iterate on A^u = bh to reach 


Uh (say 3 Jacobi or Gauss-Seidel steps). 


2. 


Restrict the residual = bh 


— A h Uh to the coarse grid by r 2h = R\ h r h 


3. 


Solve A 2h E 2h = r 2h (or come close to E 2 h by 3 iterations from E = 0). 


4. 


Interpolate E 2 h back to E h 


= l 2h E 2h . Add E h to Uh, 


5. 


Iterate 3 more times on AhU 


= bh starting from the improved Uh + E h . 



Steps 2-3-4 give the restriction-coarse solution-interpolation sequence that is the 
heart of multigrid. Recall the three matrices we are working with: 



A = Ah = original matrix 



— »2h _ 



R = R 



_ Th _ 

— 1 2h — 



restriction matrix 
interpolation matrix . 



Step 3 involves a fourth matrix A 2 h, to be defined now. A 2 h is square and it is smaller 
than the original Ah, In words, we want to "project" the larger matrix Ah onto the 
coarse grid. There is a natural choice ! The variationally correct A 2 h comes directly 
and beautifully from R and A and /: 



The coarse grid matrix is A 2 h = itf^Ahl^h = RAI . 



(6) 



When the fine grid has N = 7 interior meshpoints, the matrix Ah is 7 by 7. Then the 
coarse grid matrix RAI is (3 by 7) (7 by 7) (7 by 3) = 3 by 3. 



Ah might be the second difference matrix K/h 2 . Our 
4, so that multigrid goes from five 



Example In one dimension, A 
first example came from h 
meshpoints inside < x < 1 to two meshpoints (/ is 5 by 2 and R is 2 by 5): The neat 
multiplication (we will use it again later) is RA h = RK 5 /h 2 
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A natural choice for A 2 h on the coarse grid is K 2 /(2h) 2 and multigrid makes this choice: 



Coarse grid matrix 
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2h 
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The reader will appreciate that the I T AI rule preserves symmetry and positive definiteness, 
when A has those properties. The rule arises naturally in Galerkin methods [-.page -], 
including the finite element method. Notice how the restriction operator R with the 
factor | automatically adjusts 1/h 2 to l/(2h) 2 . 

Steps 1 and 5 are necessary, but they are really outside the essential multigrid 
idea. The smoother is step 1, the post-smoother is step 5. Those are normal 
iterations for which weighted Jacobi or Gauss-Seidel is satisfactory. 



The Errors and Eh 

Suppose we solve the coarse grid equation exactly at step 3. Is the multigrid error 
correction E h then equal to the true fine-grid error e h = u — u^l No, that is too 
much to expect ! We have only solved the smaller problem on the coarse grid, not 
the full problem. But the connection between Eh and eh is simple and crucial for 
understanding multigrid. We now track down the steps from E to e. 

Four matrices multiply e. At step 2, the residual r = b — Auh = A(u — Uh) = Ae 
multiplies by A. The restriction is multiplication by R. The solution step 3 multiplies 
by A^h = (RAI)^ 1 . The interpolation step 4 multiplies by / to find the correction 
E. Altogether, E is IA^iRA h e: 

E = I(RAI)~ l RAe and we call this E = Se. (9) 

When / is 5 by 2 and R is 2 by 5, that matrix S on the right side is 5 by 5. It 
can't be the identity matrix, since RAI and its inverse are only 2 by 2 (rank two). 
But S = I(RAI)~ 1 RA has the remarkable property S 2 = S. This says that S is the 
identity matrix on its 2- dimensional column space. (Of course S is the zero matrix 
on its 3-dimensional nullspace.) S 2 = S is easy to check: 

S 2 = (I(RAI)- 1 RA)(I(RAI)- 1 RA) = S because (RAI)- 1 RAI disappears. (10) 

So the multigrid correction E = Se is not the whole error e, it is a projection of e. 
The new error is e — E = e — Se = (I — S)e. This matrix I — S is the two- grid 
operator. I — S plays the same fundamental role in describing the multigrid steps 
2—4 that the usual M = I — P~ x A plays for each iteration in steps 1 and 5: 

v-cycle matrix = I — S iteration matrix = I — P _1 A . 
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Example (continued). The 5 by 5 matrix A h 
led in (8) to A 2h = K 2 /(2h) 2 . To find S 



IA 2 hRA h 



K 5 /h 2 and the rectangular I and R 
we multiply (7) by IA^: 



A 2 hRA h 
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Now multiply by / to find S 
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The eigenvalues of this S are 1,1,0,0,0. If you square S, you recover S 2 = S. With 
its three columns of zeros, the nullspace of S contains all fine-grid vectors of the form 
(ei, 0, 63, 0, es). Those are vectors that don't appear on the coarse grid. If the error e had 
this form, then E = Se would be zero (no improvement from multigrid). But we don't 
expect a large component of those high frequency vectors in e, because of the smoothing. 

The column space of S contains column 2 = (~, 1, ~, 0, 0) and column 4 = (0, 0, \, 1, \). 
These are "mixed-frequency vectors." We do expect them to appear in e, because the 
smoothing step didn't remove them. But these are vectors for which E = Se = e and 
they are the errors that multigrid catches ! After step 4 they are gone. 

Let me say this in another way. Because S 2 = S, the only eigenvectors are A = 
and A = 1. (If Su = Xu we always have S 2 u = \ 2 u. Then S 2 = S gives A 2 = A.) Our 
example has A = 1, 1, 0, 0, 0. The eigenvalues of I — S are 0, 0, 1, 1, 1. The eigenvectors 
e reveal what multigrid is doing: 

E = Se = In this case multigrid gives no improvement. The correction E added 
to u h in step 4 is zero. IN the example, this happens for errors e = (ei, 0, e 3 , 0, e 5 ) 
that are zero on the coarse grid where step 3 is working. 

E = Se = e In this case multigrid is perfect. The correction Eh added to Uh in 
step 4 is the whole error e/,. In the example, two eigenvectors of S for A = 1 are e = 
(1,2,2,2,1) and e = (1,2,0,-2,-1). Those have large low-frequency components. 
They oscillate up and down only once and twice. They are in the column space of 
/. They are no perfect sines, but an important part of the low-frequency error is 
caught and removed. The number of independent vectors with Se = e is the number 
of coarse gridpoints (here 2). That measures the A 2 ^ problem that step 3 deals with. 
It is the rank of S and also /. The other 5 — 2 gridpoints account for the nullspace 
of S, where E = Se = means no improvement from multigrid. 

Note The "high-frequency" vectors (ui, 0, U3, 0, U5) with Su = are not exactly 
combinations of the last three discrete sines 2/3,1/4,1/5. The frequencies are mixed by 
S, as equations (18-19) will clearly show. The exact statements are column space of S 
= column space of I and nullspace of S = nullspace of RA. The mixing of frequencies 
does not affect our main point: Iteration handles the high frequencies and 
multigrid handles the low frequencies. 

You can see that a perfect smoother followed by perfect multigrid (exact solution 
at step 3) would leave no error. In reality, this will not happen. Fortunately, a careful 
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(not so simple) analysis will show that a multigrid cycle with good smoothing can 
reduce the error by a constant factor p that is independent of h: 

||error after step 5|| < p||error before step 1|| with p < 1. (12) 

A typical value is p = Compare with p = .99 for Jacobi alone. This is the 
Holy Grail of numerical analysis, to achieve a convergence factor p (a spectral radius 
of the overall iteration matrix) that does not move up to 1 as h — > 0. We can achieve 
a given relative accuracy in a fixed number of cycles. Since each step of each cycle 
requires only 0(n) operations on sparse problems of size n, multigrid is an 0(n) 
algorithm. This does not change in higher dimensions. 

There is a further point about the number of steps and the accuracy. The user 
may want the solution error e to be as small as the discretization error (when the 
original differential equation was replaced by Au = b). In our examples with second 
differences, this demands that we continue until e = 0(h 2 ) = 0(N~ 2 ). In that case 
we need more than a fixed number of v-cycles. To reach p k = 0(N~ 2 ) requires 
k = O(logiV) cycles. Multigrid has an answer for this too. 

Instead of repeating v-cycles, or nesting them into V-cycles or W-cycles, it is 
better to use full multigrid: FMG cycles are described below. Then the operation 
count comes back to 0(n) even for this higher required accuracy e = 0(h 2 ). 

V- Cycles and W- Cycles and Full Multigrid 

Clearly multigrid need not stop at two grids. If it did stop, it would miss the remark- 
able power of the idea. The lowest frequency is still low on the 2h grid, and that part 
of the error won't decay quickly until we move to Ah or 8h (or a very coarse 512/i). 

The two-grid v-cycle extends in a natural way to more grids. It can go down to 
coarser grids (2h, Ah, Sh) and back up to (Ah, 2h, h). This nested sequence of v-cycles 
is a V-cycle (capital V). Don't forget that coarse grid sweeps are much faster than 
fine grid sweeps. Analysis shows that time is well spent on the coarse grids. So the 
W-cycle that stays coarse longer (Figure 6.11b) is generally superior to a V-cycle. 




Figure 6.11: V-cycles and W-cycles and FMG use several grids several times. 

The full multigrid cycle in Figure 6.11c is asymptotically better than V or W. 
Full multigrid starts on the coarsest grid. The solution on the 8h grid is interpolated 
to provide a good initial vector u^h on the Ah grid. A v-cycle between Ah and 8h 
improves it. Then interpolation predicts the solution on the 2h grid, and a deeper 
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V-cycle makes it better (using 2h,Ah,8h). Interpolation of that improved solution 
onto the finest grid gives an excellent start to the last and deepest V-cycle. 

The operation counts for a deep V-cycle and for full multigrid are certainly greater 
than for a two-grid v-cycle, but only by a constant factor. That is because the count 
is divided by a power of 2 every time we move to a coarser grid. For a differential 
equation in d space dimensions, we divide by 2 d . The cost of a V-cycle (as deep as 
we want) is less than a fixed multiple of the v-cycle cost: 



1 / i \ 2 2 d 
V-cycle cost < (l + ^d+(^d) + ' ' ' ) v-cycle cost = — d — - v-cycle cost 



(13) 



Full multigrid is no more than a series of inverted V-cycles, beginning on a very coarse 
mesh. By the same reasoning that led to (13), 



2 



Full multigrid cost < — — -V-cycle cost < y— — -J v-cycle cost. (14) 
And the method works in practice. But good programming is required. 



Multigrid Matrices 

For a 3-grid V-cycle, what matrix S3 corresponds to the 2-grid v-cycle projection 
S = I{RAI)~ 1 RA1 No smoothing is involved here. S and S3 only project to coarser 
problems. By itself, S3 will not be a good solver without smoothers. 

To construct A^, replace the matrix A 2 h = RAI on the middle grid by using a 
coarse restriction R c = R\\ that transfers down to the Ah grid, and an interpolation 
Ic — I"lh that comes back to 2h: 

Very coarse matrix A 4h = R c A 2h I c . (15) 

If h — ^j, that product is (3 x 7) (7 x 7) (7 x 3). The 3 by 3 problem uses on the 
Ah — h grid (with three interior unknowns). Then S3 = (S3) 2 is the matrix that goes 
down two grids, solves the very coarse problem by A^, and comes back up: 

E 3 = S 3 e = Error removed S 3 = I^QA^R^RfA . (16) 

The error that remains after an unsmoothed V-cycle will be (J — Ss)e. On the h = 
grid, there are 15 frequencies in the error e. Only the lowest 3 are (approximately) 
removed by S3C We have only solved a 3 by 3 problem with A^. It is the smoothers, 
on the fine h grid and the middle 2h grid, that reduce the high and middle frequency 
errors in the solution. 

Note to myself: S 4 / l = I c A^R c A 2 h = I C A^R C RAI = 7x7 has / on right instead 
of left. Still Sj h = S ih . 
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Numerical Experiments 

The real test of multigrid effectiveness is numerical ! Its fc-step approximation 
should approach zero and the graphs will show how quickly this happens. An initial 
guess Uq includes an error e . Whatever iteration we use, we are trying to drive uy. to 
u, and Ck to zero. The multigrid method jumps between two or more grids, so as to 
converge more quickly, and our graphs will always show the error on the fine grid. 

We can work with the equation Ae = 0, whose solution is e = 0. Following [-], 
Figure 6. a displays an initial error with low and high frequencies: 

( e o)j = sin 4jnh + sin20j7r/i with h = — . 

32 

We draw continuous functions sin47rx and sin207rx rather than 31 discrete values. 

Figure 6. b shows the error vector e% after three fine-grid sweeps of weighted 

Jacobi (with lu — |). As expected, the high frequency component has greatly decayed. 
The error e% — (I — P~ 1 A) 3 eo is much smoother than cq. For our second difference 
matrix (better known as K), the Jacobi preconditioner from Section 6.2 simply 
has P~ l = \u)I. We can ignore the factors h 2 that divide K and P, because they 
cancel. 

Figure 6.12: The initial error and the error e% after three weighted Jacobi relaxations. 

Now multigrid begins. The current fine-grid residual is = — A^e^. After restric- 
tion to the coarse grid it becomes r 2 h- Three weighted Jacobi iterations on the coarse 
grid error equation A 2h E 2h = r 2 h start with the guess E 2h = 0. That produces... 

ADD V-CYCLE AND FMG EXPERIMENTS 

Figure 6.13: The fine-grid error after 3 sweeps on the coarse grid (a) and then the 
fine grid (b). 



Eigenvector Analysis 

The reader will recognize that one matrix (like / — S for a v-cycle) can describe each 
multigrid method. The eigenvalues of a full multigrid matrix would be nice to know, 
but they are usually impractical to find. Numerical experiments build confidence. 
Computation also provides a diagnostic tool, to locate where convergence is stalled 
and a change is needed (often in the boundary conditions). If we want a predictive 
tool, the best is modal analysis. 

The key idea is to watch the Fourier modes. In our example those are discrete 
sines, because of the boundary conditions u(0) = u(l) = 0. We will push this model 
problem all the way, to see what the multigrid matrix I — S does to those sine vectors. 
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The final result in (18-19) shows why multigrid works and it also shows how pairs of 
frequencies are mixed. The eigenvectors are mixtures of two frequencies. 

The frequencies that mix together are k and N + 1 — k. The discrete sines with 
those frequencies are y k and Y k : 
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kn 



1kit 



sm 



-, sm 



N+V N+V 



and Yu 



(N+l-k)n 2(N+l-k)n 
sin — — : , sin 



N+1 



N+1 



Where y k goes up and down k times, Y k does that N + 1 — k times. 

There is something neat you have to see. and y^ have the same compo- 
nents except for alternating signs ! Cancel N+1 with N+1 in those components 
of Y k : 
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(17) 



For our model problem with second differences, we can report the result of multiplying 
by S = I(RAI)~ 1 RA. In fact the true multigrid matrix is I — S, to give the error 
e — E = (I — S) e that remains after steps 2, 3, 4. Seeing I — S is even better: 



One v-cycle (2k < N+1) (I - S) y k 

Smooth errors reduced 

Frequencies mixed (/ — S)Y k 



— ( 1 — cos ] (yu + Y k i 

2 V iV+1 ' v 



If kn 
2 V 1 + C0S iV+l ^ Vk + Yk ' 



(18) 
(19) 



Such beautiful formulas can't be expected in more complicated problems, and we 
seize this chance to make four key points: 

1. Low frequencies like k = 1, 2, 3 are greatly reduced by multigrid. The 

cosine in equation (18) will be near 1. The factor (1 — cos j^t^) is very small, 
of order (k/N) 2 . This shows how multigrid is powerful in nearly killing the low 
frequencies. These are exactly the frequencies on which Jacobi and Gauss-Seidel 
will stall in the smoothing iterations. 

2. The pair of sines and 1^ is mixed together by multigrid. We will 

see that the restriction R and interpolation I are responsible. Those are not 
like the square matrix A, which keeps frequencies separate (because the sines 
are its eigenvectors). Aliasing appears for rectangular matrices! 

3. The combinations e = y^ + are eigenvectors of I — S with A = 1. 

Just add equations (18) and (19), to get (/ — S)e = e. Since Y k has the 
same components as y k with alternating signs, y k + Y k has the correct form 
(ei, 0, e 3 , 0, . . .). Those are the vectors that we discovered earlier in the nullspace 
(A = 0) of S. They are not touched by / — S since Se = 0. 
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4. The other eigenvectors of I — S are just Sy^. We know that (I—S)Sy k = 
because S = S 2 . In our example with N = 5, the vectors Sy\ and Sy2 are 
multiples of (1, 2, 2, 2, 1) and (1, 2, 0, —2, 1) that we found explicitly. Multigrid 
removes them from the error. 

According to (18), these vectors Sy k are combinations of y k and Y k . To find a good 
combination, multiply (18) and (19) by (1 + cos and (1 — cos j^f). The right- 
hand sides are now the same and we subtract: 

S* = e* (I-S)e* = (I-S) 

These mixtures e* in square brackets are completely killed (A = 0) by multigrid. 
A smooth error vector (after Jacobi iterations have reduced its high frequency 
components) has only small components along the eigenvectors y k + Y k . Multigrid 
doesn't touch those pieces (A = 1). The larger components along the Z k will die. 

The Restriction R Produces Aliasing 

To complete this analysis we have to see where and how a pair of frequencies is mixed. 
The aliasing comes from the restriction matrix R = Rh, when both vectors y k and 
Y k h lead to multiples of the same output vector y 2 ! 1 : 

«* - G ~\ cos Mi) * and RY " = j£r) # • < 21 > 

You see the aliasing by R. We cannot hope to decide the input y^ or Y k from these 
outputs. This is normal for a short wide matrix. (The matrix R is 3 by 7 and 2 by 5 
in our examples.) The coarse mesh output has only about half as many components 
as the fine mesh input. 

The transpose of R does the opposite. Where R mixes two inputs y\ and Y^ into 
one output, the interpolation matrix / sends one input frequency on the coarse grid 
into a pair of frequencies on the fine grid: 

2 lW k h = (l + cos j^j y h k - (l - cos j^j Y k h . (22) 

Interpolation of a smooth vector (low k) on a coarse grid will excite an oscillatory 
mode Y k (high k) on the fine grid. But those oscillations have small amplitude, 
because the cosine of fc7r/(iV+l) is near 1. 

The key formulas (18) and (19) that describe multigrid come from assembling (21) 
for R, (22) for J, and the known eigenvalues for Ah and A2h- I think the calculation of 
S in (11) shows this best. Its zero columns put y k + Y k in its nullspace. The nonzero 
columns in S come from the interpolation matrix /. So Se captures a part of the 
whole error e. Multigrid solves a projection of the original problem. 



J/T 



COS 



N+l 



Vk 



cos 



kn 
N+l 



Y k 



(20) 



6.3. MULTIGRID METHODS 



©2006 Gilbert Strang 



Example completed It would be useless to repeat steps 2, 3, 4 with no smoothing in 
between. Nothing would change! The untouched vectors e = yk + Yk with (I — S)e = e 
will still be untouched. It is the smoothing matrix M = I — P~ l A that must reduce these 
errors. 

If we apply Jacobi with weight w — | at steps 1 and 5, then M = I — |A The overall 
matrix for all steps 1 to 5 will be M (I — S)M. The eigenvalues of that matrix will decide 
the success (or not) of multigrid. To my amazement, the 5 by 5 matrix M(I — S)M has 
a triple eigenvalue of | ! 

The three eigenvalues A = 1 of I — S are reduced to A = — for M(I — S)M. 

9 

The largest eigenvalue of M is .91 — you see the value of multigrid. I hope you will try 
eig(M * M * (I - S) * M * M) with double smoothing (see Problems 9-12). 

Fourier Modal Analysis 

Pure modal analysis neglects the boundaries completely. It assumes an infinite grid ! 
The vectors yk in the example were sines because of the boundary conditions. When 
boundaries are gone, the yk are replaced by infinitely long vectors y^ coming from 
complex exponentials. Now there is a continuum of frequencies uj: 

Vuj = (..., e~ 2iuj , e - *", 1, e*", e 2iuJ , . . .) with - tt < uj < tt . (23) 

We need infinite matrices Koo to multiply these infinite vectors. Second differences 
—1,2,-1 appear on all rows forever. The key is that each y^ is an eigenvector of K^, 
with eigenvalue A = 2 — 2 cos uj: 

K oo y L0 = (2-2 cos uj) Vlu because - e iu){n+1) - e ^ (n - 1} = -2 cos uj e ibjn . (24) 

This tells us the action of = K^/h 2 . It also tells us about A 2 h, when the coarse 
mesh changes h 2 to (2h) 2 and uj to 2uj. 

The restriction matrix R still introduces aliasing. The frequencies that mix to- 
gether are now uj and uj + tt. Notice how increasing uj by it produces a factor e mn = 
(— 1)" with alternating signs. This is exactly what we saw for yk and Yk. This time 
it is yuj — yuj+n that has all zeros in its even-numbered components. 

This pure Fourier analysis will go all the way to formulas for the infinite (/ — S)y u 
and (J— S)y UJ + 7r , just like equations (18) and (19). In that finite case, those equations 
explained why multigrid succeeds. They do the same in the infinite case. 

Let me emphasize why this pure modal analysis (with no boundaries and constant 
coefficients) was mentioned. It allows Fourier to work freely. The differential equation 
div(c(x, y) gradw) = f(x,y) on a general region would lead to giant complications in 
the eigenvectors for multigrid — impossible to find them. But if we fix c = constant 
and ignore boundaries, those "interior eigenvectors" return to simple combinations of 
y^j and yuj+ir- That leaves difficulties associated with the boundary conditions, which 
Fourier doesn't easily resolve. 
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Algebraic Multigrid 

We close this section with a few words about algebraic multigrid, when the problem 
comes as a system of equations Au = b. There is no grid in the background. We 
need to find replacements for the key ideas on which geometric multigrid was based: 
smooth vectors, connected nodes, coarse meshes. Replacing the first two is fairly easy, 
rescaling Ah to A^h on a (nonexistent) coarse mesh is less clear. 

1. Smooth vectors. In geometric multigrid these are vectors for which the 
norms of u and Au are comparable. That will be our indication of a smooth 
vector. High frequencies in u would be greatly amplified by A (just as the 
second derivative amplifies sinfct by k 2 ). For low frequencies, Multigrid works 
on smooth errors and Jacobi doesn't. 

2. Connected nodes. On a grid, neighboring nodes are reflected by a nonzero 
entry in A. When there is no grid, we look directly at the matrix. Its significant 
nonzero entries Ay- (magnitude greater than a An, say with a = ^) tell us when 
we should think of the "nodes" i and j as being connected. 

3. Coarse subset of nodes. Each significant entry Ay indicates that the value 
of Uj strongly influences the value of Uj. Probably the errors ej and are 
comparable, when the error is smooth. We don't need both i and j in the 
"coarse set C" of nodes. On a grid they would be neighbors, not both in C. 

But if i is not in C, then every j that strongly influences i should either be in 
C or be strongly influenced by another J that is in C. This heuristic rule is 
discussed more fully in [— ], with an algorithm for constructing C. (In general, 
too many coarse unknowns are better than too few.) 

The excellent book [-] also constructs the coarse-to-fine interpolation matrix I. 
This starts with the errors Ej for j in C, and leaves them unchanged. If i is not in 
C, the interpolated value Ei at step 2 of multigrid will be a weighted combination of 
the Ej that do have j in C . In our one-dimensional model problem, that weighted 
combination was the average of the two neighbors of i. That model problem certainly 
had a grid ! 

The interpolating combination will give greatest weight to the ej for which j 
in C strongly influences i. But there may be smaller entries Ay that cannot be 
completely ignored. The final decision on the weights for each interpolated value 
is more subtle than a simple average. Algebraic multigrid is more expensive than 
geometric multigrid, but it applies to a much wider range of sparse matrices A (and 
the software can control AMG without us). We still expect a "smoothing + multigrid" 
combination that is close to 0(n) steps for an accurate solution of Au = b. 

Let me mention an important contrast between solid mechanics and fluid mechan- 
ics. For solids, the original finite element grid is already relatively coarse. We are 
typically looking for "engineering accuracy" and we don't have fine scale motions to 
resolve. Multigrid is not such a common choice for structural problems (elimination 
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is simpler). Fluids do have fine scales so that multigrid becomes a natural idea, not 
only numerically but physically. Of course fluids don't generally present symmetric 
matrices, because of the convective terms, and finite element methods may require 
"upwind adjustments." The analysis of multigrid convergence becomes harder for 
fluids, just when a multiscale approach becomes attractive. 



Problem Set 6.3 



1 What is the 3 by 7 matrix -Rejection that copies Vx,v 2 , v 3 from u 2 ,u^u§ and 
ignores u±, u 3 , u 5 , u 7 ? 

2 Write down a 9 by 4 bilinear interpolation matrix / that uses the four grid 
values at the corners of a square (side 2h) to produce nine values at the (h) 
gridpoints. What constant should multiply the transpose of / to give a one- 
element restriction matrix R ? 

3 In Problem 2, the four small squares (side h) subdivide into 16 smaller squares 
(side h/2). How many rows and columns in the interpolation matrix 7^/ 2 ? 

4 If A is the 5-point discrete Laplace matrix (with —1, —1,4, —1, —1 on a typical 
row) what is a typical row of A 2 h = RAI using bilinear interpolation as in 
Problems 2-3 ? 

5 Suppose A comes from the 9-point stencil (8/3 surrounded by eight entries of 
—1/3). What is now the stencil for A 2 h = RAI ? 

6 Verify Ry% = ^(1 + cos -^i)y'i h in equation (23) for the linear restriction matrix 
R applied to discrete sines y\ with k < |(iV+l). 

7 Show that RYj} = ~(— 1 — cos jj^)y 2h in equation (23) for the "complementary" 
discrete sines Y£ = y N +i-k- Now N + 1 — k > |(iV+l). 

8 Verify equation (24) for linear interpolation applied to the discrete sines y\ with 
k<±(N+l). 

9 With h = |, use the 7 by 3 and 3 by 7 matrices I and R in equations (1-2) to 
find the 3 by 3 matrix A 2 h = RAI with A = K 7 /h 2 . 

10 Continue Problem 9 to find the 7 by 7 matrix S = I(RAI)~ l RA. Verify that 
S 2 = S, and that S has the same nullspace as RA and the same column space 
as I. What are the seven eigenvalues of S ? 

11 Continue Problem 10 to find (by MATLAB) the multigrid matrix I — S and 
the presmoothed/postsmoothed matrix MSM, where M is the Jacobi smoother 
/ - uD^A = I - \K 7 with u = §. Find the eigenvalues of MSM ! 

12 Continue Problem 11 to find the eigenvalues of M 2 SM 2 with two Jacobi smoothers 
(u = |) before and after the grid changes. 
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13 With unweighted Jacobi (uo = 1 and M = I — \K 7 ) find MSM and its eigen- 
values. Is the weighting useful ? 

14 Starting from h = ^ and the second difference matrix A = Ki 5 /h 2 , compute 
the projected 7 by 7 matrix A 2h = RAI and the doubly projected 3 by 3 matrix 
A±h = R c A 2 hI c - Use linear interpolation matrices I and I c , and restriction 
matrices R— |/ T and i? c = 

15 Use A^h from Problem 14 to compute the projection matrix 5*3 in (16). Verify 
that this 15 by 15 matrix has rank 3, and that (S3) 2 = 5*3. 

16 The weighted Jacobi smoothers are M h = I 15 — |i^i5 and M 2 h = h ~ \K 7 . With 
MATLAB, compute the smoothed error reduction matrix V3 and its eigenvalues: 

V 3 = M h IM 2h (I 7 - S 2h )M 2h RAM h . 

S2h — IcA^R c A 2 h is the projection matrix for the v-cycle within the V-cycle. 

17 Compute the (15 x 7) (7 x 3) linear interpolation matrix I^Ah- What is the 
restriction matrix? 



